Using Collaborative Tagging for Text Classification: From Text Classification to Opinion Mining
نویسندگان
چکیده
Numerous initiatives have allowed users to share knowledge or opinions using collaborative platforms. In most cases, the users provide a textual description of their knowledge, following very limited or no constraints. Here, we tackle the classification of documents written in such an environment. As a use case, our study is made in the context of text mining evaluation campaign material, related to the classification of cooking recipes tagged by users from a collaborative website. This context makes some of the corpus specificities difficult to model for machine-learning-based systems and keyword or lexical-based systems. In particular, different authors might have different opinions on how to classify a given document. The systems presented hereafter were submitted to the DÉfi Fouille de Textes 2013 evaluation campaign, where they obtained the best overall results, ranking first on task 1 and second on task 2. In this paper, we explain our approach for building relevant and effective systems dealing with such a corpus.
منابع مشابه
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملTopic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms to can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and many more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
متن کاملA Study on the Combined Approach of Sentiment Classification Based on Ontology
The text documents contain opinions or sentiments on some objects, such as movie reviews, book reviews, product reviews etc. Sentiment analysis is mining the sentiment or opinion words and identification or analysis of the opinion and arguments in text. Here this paper proposed an ontology based combination approach to improve the exits approaches of sentiment classifications and to use supervi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Informatics
دوره 1 شماره
صفحات -
تاریخ انتشار 2014